Exploitation of the cellulase enzyme complex of Neurospora

ABSTRACT

The invention relates to the gene encoding the enzyme cellobiohydrolase-1. Specifically, the invention concerns the elucidation of the regulatable promoter sequence of said gene and the subsequent genetic manipulation of said sequence so as to combine it with DNA sequence structure of a heterologous peptide in order to provide for selective expression of heterologous peptide in accordance with the expression features of the promoter.

The invention relates to a method and recombinant means particularly, but not exclusively, expression cassettes and expression/export cassettes for the production of heterologous peptides and the enhanced production of cellulases especially cellobiohydrolase-1. The method and means have particular application in the production of such peptides and enzymes from the biotechnological exploitation of filamentous fungi and particularly Neurospora crassa.

The most abundant cell-wall and structural polysaccharide in the plant world is cellulose. Cellulose is a linear polymer of D-glucose arranged in a Beta 1-4 linkage. Cellulose is a major component of wood and thus of paper; it is also a major component of cotton and other plant materials.

On complete hydrolysis, cellulose is broken down to D-glucose, but partial hydrolysis yields a reducing disaccharide cellobiose in which the linkage between the D-glucose units is a glycosidic Beta 1-4 arrangement. Enzymes capable of hydrolysing cellulose are not secreted in the digestive tract of most mammals and therefore cellulose is not a source of food. However, ruminants can use celluloses as food because in the rumen of their stomachs they house bacteria which produce the enzyme cellulase.

As fossil fuel reserves become depleted, a renewable feed-stock for the chemical industry becomes more significant. The obvious renewable resource is cellulose, which is already in embarrassingly large supply and largely wasted. However, the conversion of cellulose to more readily utilisable substances such as sugars and alcohols is problematic.

Neurospora crassa grows well on cellulosic substrates. In doing so, it secretes enzymes of the cellulase complex, hydrolysing the substrate outside the cell. The resulting soluble sugars may be recovered before they are taken up by the cell and further metabolised. The amount of cellulose/cellobiose typically required to activate gene expression is represented by 1-2% by weight of cellulose/cellobiose.

The cellobiohydrolase-1 enzyme of Neurospora crassa is the major enzyme in the cellulase complex, and one of the major exported proteins of the organism when induced by cellulose or cellobiose (the product of partial hydrolysis of cellulose). Furthermore, Neurospora crassa is a very efficient cellulolytic species, able to hydrolyse cellulose efficiently, and grow on it as the sole carbon source. We have grown it on a range of cellulosic substrates, including pressed-sugar-beet pulp, cereal straw and spent malted grains from breweries. Indeed, Neurospora has been isolated in the wild from burnt sugar cane, and so is likely to grow well on bagasse from sugar cane processing. In addition, it grows very well on starch and a wide range of soluble sugars. Its nitrogen requirement for growth is readily satisfied by the supply of any one of a wide range of nitrogen sources, including protein, amino acids, ammonium ions, nitrate, nitrite, and urea. Its only complex biochemical requirement is for trace amounts of the vitamin biotin.

It follows from the above that genetic manipulation of the gene encoding the promoter and associated enzyme sequence structure for the enzyme cellobiohydrolase-1 will enable us to do a number of things, namely:

a) Increase the level of cellobiohydrolase-1 enzyme either by increasing the copy number of the cbh-1 gene or increasing the strength of the promoter of the cbh-1 gene. Both of these ways could be achieved by transforming in either additional copies or an altered copy of the gene, or possibly both. Thus for example, one could produce by transformation multiple copies of the gene encoding the cellobiohydrolase-1 enzyme so as to increase the level of cellobiohydrolase-1 production.

b) Alternatively, one could, by further manipulation, increase the strength of the cellobiohydrolase promoter thus increasing the level of cellobiohydrolase-1 enzyme.

c) Attach a suitable heterologous gene to the cellobiohydrolase promoter, thus ensuring that such gene is transcribed by cellulose- or cellobiose-induction and at the high rate at which the enzyme cellobiohydrolase would normally be produced, resulting in the production of high levels of the heterologous gene product.

The expression constructs are of the following types:

1) A transcriptional fusion, including the cbh-1 promoter and regulatory sequences upstream from a multiple cloning site, to allow the construction of transcriptional fusions with the coding sequence of any desired heterologous peptide. Such production would be intracellular, requiring subsequent purification of the product from the cell extract.

2) A translational fusion, including the cbh-1 promoter and export signal peptide in translational fusions (in all three possible reading frames) with the coding sequence of the desired heterologous peptide.

3) A translational fusion of a heterologous peptide near the C-terminus of the cbh-1 gene, with a proteolytic cleavage site in a linking region to allow subsequent cleavage of the heterologous peptide from the cbh-1. This would exploit the dispensable hinge and cellulose-binding domain of cbh-1, replacing this region with the other peptide.

There is a further advantage to be gained from manipulating the gene encoding the cellobiohydrolase-1 enzyme in that the C-terminal end of the enzyme comprises a cellulose-binding domain. Genetic manipulation such that this domain is spliced onto any enzyme would confer on such enzyme cellulose-binding properties. This could be exploited in at least two ways. Firstly, the cellulose-binding domain could be used to immobilise a chosen heterologous protein onto a cellulose matrix. This would in turn facilitate biocatalysis by subsequently exposing the matrix to an appropriate substrate. Secondly, the cellulose-binding domain could be exploited for purification. For example, any desired heterologous protein having attached thereto a cellulose-binding domain could be bound to a cellulose matrix during the process of purification. Further, the process of purification could be taken one step further by specific proteolytic cleavage at a site between the desired protein and the cellulose-binding domain, so releasing the desired protein but leaving the cellulose-binding domain attached to the matrix. The cellulose-binding domain of Neurospora crassa is suitable for this type of purification because it is known to be a relatively small and efficient binding domain.

There are at least two possible genetic constructs:

1) a cloning construct with a cloning site in or N-terminal to the hinge region, that is, the region between the heterologous catalytic domain and the cellulose-binding domain. It is possible to insert the coding sequence for a heterologous peptide in a translational fusion immediately upstream from the hinge and cellulose-binding domain. With a suitable promoter, expression of the fusion protein can be achieved. If the heterologous peptide is an enzyme, the fusion protein with this enzyme activity can be immobilised by allowing it to attach to a cellulose matrix. The enzyme substrate can then be passed over the immobilised enzyme and the product produced.

2) an extension of the above with a specific proteolytic cleavage site constructed in the (hinge) region between the heterologous catalytic domain and the cellulose-binding domain. This would permit a simple purification by specifically binding the fusion protein to the cellulose matrix while washing all others off, and then specifically cleaving with the protease to release the heterologous moiety of the fusion protein from the still-bound cellulose-binding domain.

The terms heterologous gene expression and heterologous protein are used in this document to mean the expression of proteins not present or common in the host.

Ideally the technology of the invention will be used to produce mammalian peptide hormones or any protein of pharmaceutical significance.

The genus Neurospora has several advantages for study with a view to its possible exploitation as a host for heterologous gene expression or enhanced cellulose production. These advantages are documented in copending application number PCT/GB 94/01789.

Here we report the DNA sequence of the cellobiohydrolase-1 gene, cbh-1, of Neurospora crassa together with flanking sequences and compare its amino acid sequence with other cellobiohydrolase-1 genes emanating from different organisms.

A full understanding of this gene, cbh-1, has enabled us to genetically engineer expression cassettes and expression/export cassettes containing high level, regulated promoter along with any other pre-selected gene sequence. The control of production of this gene sequence is in accordance with the repression induction features of the promoter. Thus we can selectively control the production of the said gene sequence according to the presence or absence of cellulose or cellobiose.

It is apparent that this technology has great significance in the genetic engineering industry because it enables selected production of a pre-determined peptide in an extremely efficient and cost effective way without the production of secondary metabolites. Further, since Neurospora, like other filamentous ascomycete fungi, but unlike yeast, tends to glycosylate proteins in a way resembling that of mammals, there is a reasonable expectation that any heterologously produced mammalian peptide hormone sequences requiring glycosylation for biological activity will in fact be biologically active.

Further, since the cellobiohydrolase-1 enzyme has a cellulose-binding domain, it follows that the pre-selected gene sequences which are attached to the promoter can be so engineered that they are also attached to the said cellulose-binding domain, thus conferring cellulose-binding properties on the pre-selected peptide corresponding to the pre-selected gene sequence. This cellulose-binding property can be used during biocatalysis to ensure that the relevant enzyme is attached to a cellulose matrix prior to the introduction of its relevant substrate. Further, the cellulose-binding domain can be used as indicated above during purification procedures.

It is therefore an object of the invention to provide methods and means for facilitating the enhanced breakdown of cellulose, conferring cellulose binding properties on pre-selected heterologous peptides and providing a system for the efficient cellulose/cellobiose-induction of heterologous proteins.

According to a first aspect of the invention there is provided a regulated promoter having the DNA sequence structure shown in FIG. 1 (SEQ ID NO:1), or part thereof, or a functionally equivalent nucleotide sequence.

According to a second aspect of the invention there is provided a regulated promoter and an upstream activator having the DNA sequence structure shown in FIG. 1 (SEQ ID NO:1), or part thereof.

Preferably said DNA sequence structure encodes a protein, the amino acid sequence of which is depicted in FIG. 1 (SEQ ID NO:2) or a protein of equivalent biological activity having substantially the amino acid sequence depicted in FIG. 1 (SEQ ID NO:2).

According to a third aspect of the invention there is provided a regulated promoter as aforedescribed which is further provided with linkers whereby ligation of the promoter with a pre-selected gene encoding a desired protein is facilitated.

According to a fourth aspect of the invention there is provided a regulated promoter as aforedescribed which is further provided with linkers whereby ligation of the promoter with a pre-selected gene encoding a desired protein is facilitated and also a signal sequence to facilitate export of said desired protein from its host cell.

Preferably said linkers include restriction sites or enzyme recognition sites to facilitate subsequent cleavage.

According to a fifth aspect of the invention there is provided a DNA sequence structure shown in FIG. 1 (SEQ ID NO:1), or part thereof, or a functionally equivalent nucleotide sequence, which also includes cloning sites and processing sites which preferably are located at the C-terminal cellulose-binding domain of the enzyme.

According to a further aspect of the invention there is provided a vector or plasmid incorporating the aforementioned DNA sequence structures.

According to a yet further aspect of the invention there is provided an expression cassette including at least the aforementioned regulated DNA promoter sequence plus a linker.

Preferably the expression cassette also includes the upstream activator sequence.

Preferably further still said linker can be subsequently cleaved.

Preferably further still the expression cassette also includes the sequence structure encoding the cellulose-binding domain.

Preferably further still said expression cassette includes a Neurospora selectable marker.

Preferably further still said expression cassette contains a replication origin from, ideally, E. coli and preferably also an E. coli selectable marker, for example, a gene encoding ampicillin-resistance.

Preferably further still said expression cassette incorporates a multiple cloning site whereby any pre-selected gene sequence, homologous or heterologous, can be inserted via transcriptional fusion.

According to a yet further aspect of the invention there is provided an expression/export cassette which incorporates any one or combination of the aforementioned expression features and which further incorporates the DNA sequence structure encoding a secretion signal.

Preferably said expression/export cassette contains the aforementioned DNA sequence translationally fused to the coding sequence of the heterologous peptide.

Preferably three different expression/export cassettes would be constructed. The multiple cloning site oligonucleotide is in a different reading frame in each to permit in-frame translational fusion to the coding sequence for the heterologous peptide. This is achieved by appropriate design of the ends of the synthetic multiple cloning site oligonucleotide.
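The need for three cassettes follows from modular arithmetic: the heterologous coding sequence remains in frame only when the number of bases between the start codon and the first complete codon of the insert is a multiple of three. The following Python sketch illustrates this under stated assumptions. The function names, the 84-base signal region and the 10-base linker are hypothetical values chosen for illustration and are not taken from the constructs described here.

```python
def is_in_frame(upstream_len: int, insert_lead: int) -> bool:
    """True if the insert's first full codon continues the upstream frame.

    upstream_len: bases from the A of the ATG up to the fusion junction.
    insert_lead:  bases in the insert that precede its first complete codon.
    """
    return (upstream_len + insert_lead) % 3 == 0


def pick_cassette(signal_len: int, linker_len: int, insert_lead: int) -> int:
    """Return the number of padding bases (0, 1 or 2) that restores the frame.

    Each value corresponds to one of the three cassettes, which are assumed
    here to differ only by 0, 1 or 2 extra bases in the cloning-site linker.
    """
    for pad in (0, 1, 2):
        if is_in_frame(signal_len + linker_len + pad, insert_lead):
            return pad
    raise AssertionError("unreachable: one of three frames always fits")


if __name__ == "__main__":
    # Hypothetical numbers: 84-base signal region, 10-base linker.
    for lead in (0, 1, 2):
        print("insert lead", lead, "-> padding", pick_cassette(84, 10, lead))
```

Exactly one of the three padding values satisfies the condition for any given insert, which is why a set of three cassettes is sufficient.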

Although the provision of an expression/export cassette is advantageous in that it enables a heterologous peptide to be both expressed and then exported, it is limiting because only those peptides which are susceptible to secretion can be made in this way.

According to a yet further aspect of the invention there is provided a method for transforming filamentous fungus, and particularly Neurospora crassa comprising the insertion of at least one of the aforementioned expression cassettes and/or expression/export cassettes into same using recombinant techniques.

According to a yet further aspect of the invention there is provided a filamentous fungus including at least one expression cassette and/or expression/export cassette according to the invention.

Preferably said filamentous fungus is Neurospora crassa.

According to a yet further aspect of the invention there is provided a method for the production of a pre-selected heterologous peptide from at least one filamentous fungus comprising:

a) providing either an expression cassette or expression/export cassette as aforedescribed;

b) transforming a pre-selected species of filamentous fungus with at least one of said cassettes;

c) culturing said transformed fungus; and

d) harvesting said heterologous peptide.

It will be apparent from the above that, when the cbh-1 promoter is used and the heterologous peptide production system is grown on cellulose, both good growth and induction of expression can be achieved.

According to a yet further aspect of the invention there is provided a protein having a cellulose-binding domain engineered in accordance with the invention.

According to a yet further aspect of the invention there is provided a filamentous fungus ideally Neurospora crassa capable of producing enhanced levels of cellulase enzyme complex in response to the presence of cellulose or cellobiose, which enhanced production is as a result of an increase in the copy number of the cellobiohydrolase gene or an increase in the strength of the associated promoter for the gene.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described by way of example only with reference to the following Figures wherein:

FIGS. 1A-1D represent the nucleic acid (SEQ ID NO:1) and protein (SEQ ID NO:2) sequences of the cellobiohydrolase-1 gene of Neurospora crassa. Some restriction enzyme sites and PCR primers are indicated. Important consensus sequences are underlined. Putative N-glycosylation sites are indicated by an asterisk. Intron sequence and non-coding sequences are in lower case;

FIGS. 2A-2B represent the restriction map of the Neurospora crassa cbh-1 clone and in particular the restriction enzyme mapping of clone X (FIG. 2A), and the detailed restriction mapping of the region binding to the probes (FIG. 2B). The region binding to the probes is indicated in bold. P1 and P2 are PCR primers used for the amplification of the 0.8 Kb fragment. ATG is the start codon of the cbh-1 gene. TAA is the stop codon of the cbh-1 gene. E=EcoRI, B=BamHI, H=HindIII, S=SalI, P=PstI, X=XhoI, K=KpnI, C=ClaI. All sizes are in Kb;

FIGS. 3A-3C represent an alignment of the Neurospora crassa cbh-1 protein sequence with related fungal cellulases. NCRX, HGRX, TRRX, TRVX, PHCX represent cbh-1 protein sequences of Neurospora crassa (SEQ ID NO:2), H. grisea (SEQ ID NO:3), T. reesei (SEQ ID NO:4), T. viride (SEQ ID NO:5) and P. chrysosporium (SEQ ID NO:7) respectively. TRRN represents the T. reesei EG 1 protein sequence (SEQ ID NO:6);

FIGS. 4A-4B represent the strategy used for the generation of plasmids; and

FIG. 5 represents the sequencing strategy of the cellobiohydrolase-1 gene of Neurospora crassa.

Cloning of the Cellobiohydrolase-1 Gene

The Neurospora cellobiohydrolase-1 gene, cbh-1, was cloned and sequenced using conventional methods such as sequence alignment of the gene from other species, design of nested PCR primers, and production of a fragment by PCR which was used to identify a genomic clone from a Neurospora genomic library in the vector lambda J1. The clone was sub-cloned into pBluescript, and sequenced by the dideoxy method.

The gene encodes a protein (see FIG. 1 (SEQ ID NO:2)) of approximately 550 amino acids including an export signal sequence between amino acids 1-28 approximately, a catalytic domain between amino acids 29-470 approximately, a hinge region between amino acids 471-519 approximately and a C-terminal cellulose-binding domain between amino acids 520-550.

FIG. 3 shows an alignment of the cbh-1 amino acid sequence from Neurospora crassa with the corresponding sequences from other organisms. From top to bottom, the sequences are as follows: N. crassa, H. grisea, T. reesei, T. viride, P. chrysosporium.
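The relationship between the genomic sequence and the encoded protein can be checked from the feature coordinates given in the sequence listing for SEQ ID NO: 1 (CDS join(152..832, 892..1758), intron 833..891). The Python sketch below is illustrative only: the helper names are hypothetical, and the string used to exercise the splicing helper is a toy sequence, not the cbh-1 sequence.

```python
# Coordinates are 1-based and inclusive, as in the sequence listing.
CBH1_EXONS = [(152, 832), (892, 1758)]
CBH1_INTRON = (833, 891)


def span_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def cds_length(exons) -> int:
    """Total number of coding bases across all exon intervals."""
    return sum(span_length(start, end) for start, end in exons)


def splice(genomic: str, exons) -> str:
    """Join the exon intervals of a genomic sequence into one coding sequence."""
    return "".join(genomic[start - 1:end] for start, end in exons)


if __name__ == "__main__":
    total = cds_length(CBH1_EXONS)
    print("coding bases:", total)
    print("codons:", total // 3)
    print("intron length:", span_length(*CBH1_INTRON))
    # Toy example only: remove positions 4..6 from a 12-base string.
    print(splice("AAACCCGGGTTT", [(1, 3), (7, 12)]))
```

With these coordinates the coding region spans 1548 bases, that is 516 codons, which agrees with the length of 516 amino acids given for SEQ ID NO: 2 in the sequence listing.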

Restriction Enzyme Mapping of Clone

The restriction enzyme mapping of the clone is shown in FIG. 2. Moreover, in FIG. 4 the strategy for the generation of a new plasmid (p2.1 K) is shown towards the upper part of the Figure, and towards the lower part of the Figure the strategy for the generation of a set of new plasmids used during sequencing is shown.

In FIG. 5 the sequencing strategy of the cellobiohydrolase-1 gene of N. crassa is shown.

Choice of Reporter Gene

Two obvious choices of reporter gene exist. The first of these is the well-characterised GUS (β-glucuronidase) reporter gene available in the plasmid pNom123. This has the hph hygromycin-resistance gene as its Neurospora-selectable marker. An alternative reporter gene is the Neurospora tyr tyrosinase construct pTyr103 obtained from Dr S Free, SUNY, Buffalo. Another alternative reporter is pho-2 (acid phosphatase) of Neurospora crassa.

Isolation of the Essential Sequence of the Promoter

Experimental investigation of the limits of the essential promoter was undertaken by the cleavage of the sub-cloned promoter-reporter gene construct, and the deletion in from the 5'-end of the sub-clone. This involves either deletion of specific restriction fragments, subject to available restriction sites, or exonuclease degradation. In either case, the shortened "promoter" is religated into the reporter construct and tested for residual promoter activity and regulation.

In the present case, the deletion in from the 5'-end of the sub-clone was done using mung bean exonuclease digestion. Alternatively, it could be done using any suitable restriction sites so as to provide a nested set of deletions. These deletions, or shortened promoter sequences, were religated into a reporter construct and tested for residual promoter activity and regulation.
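The idea of a nested set of 5' deletions can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions: the function name, the step size and the minimum length are hypothetical, the input is a toy string rather than the cbh-1 promoter, and real exonuclease digestion yields a distribution of end points rather than evenly spaced ones.

```python
from __future__ import annotations


def nested_5prime_deletions(promoter: str, step: int, min_len: int) -> list[str]:
    """Return progressively shorter fragments sharing the same 3' end.

    Each fragment lacks a further `step` bases from the 5' end; fragments
    shorter than `min_len` are not produced.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    last_start = len(promoter) - min_len
    return [promoter[i:] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Toy sequence only; each printed fragment would be religated upstream
    # of the reporter gene and assayed for residual promoter activity.
    for fragment in nested_5prime_deletions("ABCDEFGHIJ", step=3, min_len=4):
        print(len(fragment), fragment)
```

The shortest fragment in such a series that still gives regulated reporter expression marks the approximate 5' limit of the essential promoter.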

Transformation into Neurospora

Standard transformation methodology was used to effect the transformation of DNA constructs into Neurospora spheroplasts, using the cell wall-degrading enzyme Novozym234 (Radford et al [1981] Molec Gen Genet 184, 567-569).

Selection of Transformants

Transformants were selected for pNom123 (the GUS reporter gene) by initial selection for hygromycin-resistance. Expression of the GUS activity was detected in a subsequent step by the development of blue colour on X-gluc substrate.

With pTyr103, the derived plasmids with putative promoter inserts have no independent selectable marker. They were co-transformed with a second plasmid with a selectable marker, a process which gives circa 50% co-integration of the unselected plasmid. Although a number of co-selectable plasmids are suitable, an example would be pFB6 (Buxton and Radford [1984] Molec Gen Genet 190, 403-405), containing the cloned pyr-4 gene of Neurospora, selecting transformants by complementation of a pyrimidine-requiring recipient strain. Transformants thus selected demonstrated promoter activity from the cbh-1 promoter region by expression of tyrosinase activity in vegetative culture, tyrosinase normally being active only in the sexual phase of the life cycle. Tyrosinase activity is again detected colourimetrically, by the conversion of supplied L-tyrosine to black melanin pigment, or of L-DOPA to a soluble red pigment.

The red colour from L-DOPA, and the blue colour from X-gluc are both quantitatively assayable.

Isolation of the Essential Sequence of the Cellulose Binding Domain

Experimental investigation of the limits of the essential cellulose-binding domain was undertaken by the cleavage of the sub-cloned cellulose-binding gene construct, and the deletion in from the 3'-end of the sub-clone. This involves either deletion of specific restriction fragments, subject to available restriction sites, or exonuclease degradation. In either case, the shortened cellulose-binding domain is religated into the reporter construct and tested for residual cellulose-binding activity.

Construction of an Expression Cassette

The expression constructs are of three types:

a) a transcriptional fusion, including the cbh-1 promoter and regulatory sequences upstream from a multi-cloning site, to allow the construction of transcriptional fusion with a coding sequence of any desired heterologous peptide. Such production would be intracellular, requiring subsequent purification of the product from a cell extract;

b) a translational fusion, including the cbh-1 promoter and export signal peptide in translational fusions, in all three possible reading frames, with the coding sequence of the desired heterologous peptide;

c) a translational fusion of a heterologous peptide near the C-terminus of the cbh-1 with a proteolytic site in a linking region to allow subsequent cleavage of the heterologous peptide from the cbh-1. This would exploit the dispensable hinge and cellulose-binding domain of cbh-1, replacing this region with the other peptide.

An expression cassette developed using a cbh-1 promoter contains a replication origin from E. coli, an E. coli-selectable marker such as ampicillin-resistance, a Neurospora-selectable marker such as hygromycin-resistance, and the cbh-1 promoter/regulatory region upstream from a multi-cloning site. Such a construct was amplified in E. coli, transformed into Neurospora crassa, and used to express the inserted coding sequence in the Neurospora crassa mycelium, under cellulose induction if the promoter still had regulated expression.
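The composition of such a cassette can be written down as a simple data structure, which makes it easy to check that no element has been omitted and that the promoter lies upstream of the cloning site. The Python sketch below is illustrative only. The class, the role labels and the order of the first three elements are assumptions made for the example; only the list of elements and the position of the promoter relative to the multi-cloning site come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str  # free-text description of the element
    role: str  # hypothetical label used only for the checks below


REQUIRED_ROLES = {
    "ecoli_origin", "ecoli_marker", "neurospora_marker", "promoter", "mcs",
}


def missing_roles(elements) -> set:
    """Roles required of an expression cassette that the list does not supply."""
    return REQUIRED_ROLES - {e.role for e in elements}


def promoter_precedes_mcs(elements) -> bool:
    """True if the promoter is listed upstream of the multi-cloning site."""
    roles = [e.role for e in elements]
    if "promoter" not in roles or "mcs" not in roles:
        return False
    return roles.index("promoter") < roles.index("mcs")


EXAMPLE_CASSETTE = [
    Element("E. coli replication origin", "ecoli_origin"),
    Element("ampicillin-resistance gene", "ecoli_marker"),
    Element("hygromycin-resistance gene", "neurospora_marker"),
    Element("cbh-1 promoter/regulatory region", "promoter"),
    Element("multi-cloning site", "mcs"),
]

if __name__ == "__main__":
    print("missing:", missing_roles(EXAMPLE_CASSETTE))
    print("promoter upstream of MCS:", promoter_precedes_mcs(EXAMPLE_CASSETTE))
```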

Construction of an Expression Export Cassette

This construct was in many ways similar to the above expression cassette except that it further incorporated a DNA sequence structure encoding a secretion signal. Moreover, it was a translational fusion, containing the cbh-1 signal sequence translationally fused to the N-terminal region of the coding sequence of the heterologous peptide. Because of the necessity to retain the function of the signal sequence and also a common reading frame through the fusion, three different constructs were required. In each case, the multiple cloning site (MCS) was in a different reading frame, achieved by appropriate design of the ends of the synthetic MCS. Furthermore, care was needed in the design of the MCS to ensure the function of the export signal and the recognition and cleavage of the signal from the mature product in the process of maturation and secretion.

In this case, the heterologous product was both expressed and exported into the culture medium. This limits the range of peptides which can be made, but facilitates the purification of those compatible with this production method.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

   (iii) NUMBER OF SEQUENCES: 7

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1849 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Neurospora crassa
          (B) STRAIN: Oak Ridge 74A

   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: lambda J1
          (B) CLONE: X

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: join(152..832, 892..1758)

    (ix) FEATURE:
          (A) NAME/KEY: intron
          (B) LOCATION: 833..891

    (ix) FEATURE:
          (A) NAME/KEY: exon
          (B) LOCATION: <152..832

    (ix) FEATURE:
          (A) NAME/KEY: exon
          (B) LOCATION: 892..>1761

     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Taleb, F
                       Radford, A
          (B) TITLE: Cloning sequencing and homologies of the
                     CBH-1 (exocellobiohydrolase) gene of Neurospora
                     crassa
          (K) RELEVANT RESIDUES IN SEQ ID NO: 1: FROM 1 TO 1849

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAGTCTGTAA CCAAACTCTT TACCCGTCCT TGGGTCCCTG TAGCAGTATA TCCATTGTTT  60

CTTATATAAA GGTTAGGGGG TAAATCCCGG CGCTCATGAC TTCGCCTTCT TCCCTTATCT  120

[...]A TTGCACTCAA A ATG AGG GCC TCG CTC CTG GCC  172
                    Met Arg Ala Ser Leu Leu Ala
                    1               5

TTC TCC CTC GCT GCC GCC GTG GCC GGC GGC CAG CAG GCC GGC ACT CTC  220
Phe Ser Leu Ala Ala Ala Val Ala Gly Gly Gln Gln Ala Gly Thr Leu
        10                  15                  20

ACC GCC AAG AGG CAC CCA TCC CTC ACA TGG CAG AAG TGC ACC AGG GGG  268
Thr Ala Lys Arg His Pro Ser Leu Thr Trp Gln Lys Cys Thr Arg Gly
    25                  30                  35

GGG TGC CCG ACC CTG AAC ACC ACG ATG GTG CTC GAC GCG AAC TGG CGC  316
Gly Cys Pro Thr Leu Asn Thr Thr Met Val Leu Asp Ala Asn Trp Arg
40                  45                  50                  55

TGG ACT CAC GCC ACG TCC GGC TCC ACG AAG TGC TAC ACG GGC AAC AAG  364
Trp Thr His Ala Thr Ser Gly Ser Thr Lys Cys Tyr Thr Gly Asn Lys
                60                  65                  70

TGG CAG GCG ACG CTC TGC CCC GAT GGC AAG TCG TGC GCG GCG AAC TGC  412
Trp Gln Ala Thr Leu Cys Pro Asp Gly Lys Ser Cys Ala Ala Asn Cys
            75                  80                  85

GCG CTG GAC GGC GCC GAC TAC ACC GGC ACC TAC GGG ATC ACC GGG AGC  460
Ala Leu Asp Gly Ala Asp Tyr Thr Gly Thr Tyr Gly Ile Thr Gly Ser
        90                  95                  100

GGC TGG TCC CTC ACG CTC CAG TTC GTC ACG GAC AAC GTC GGC GCC CGT  508
Gly Trp Ser Leu Thr Leu Gln Phe Val Thr Asp Asn Val Gly Ala Arg
    105                 110                 115

GCC TAC CTG ATG GCG GAC GAC ACG CAG TAC CAG ATG TTG GAG CTC CTG  556
Ala Tyr Leu Met Ala Asp Asp Thr Gln Tyr Gln Met Leu Glu Leu Leu
120                 125                 130                 135

AAC CAG GAG TTG TGG TTC GAC GTC GAT ATG TCG AAC ATC CCG TGC GGT  604
Asn Gln Glu Leu Trp Phe Asp Val Asp Met Ser Asn Ile Pro Cys Gly
                140                 145                 150

CTG AAC GGC GCC CTC TAC CTC TCG GCG ATG GAC GCG GAT GGG GGC ATG  652
Leu Asn Gly Ala Leu Tyr Leu Ser Ala Met Asp Ala Asp Gly Gly Met
            155                 160                 165

AGG AAG TAC CCG ACC AAC AAG GCT GGC GCT AAG TAC GCT ACC GGT TAC  700
Arg Lys Tyr Pro Thr Asn Lys Ala Gly Ala Lys Tyr Ala Thr Gly Tyr
        170                 175                 180

TGC GAC GCT CAG TGC CCC CGT GAT CTC AAG TAC ATC AAC GGT ATC GCC  748
Cys Asp Ala Gln Cys Pro Arg Asp Leu Lys Tyr Ile Asn Gly Ile Ala
    185                 190                 195

AAC GTT GAG GGC TGG ACC CCT TCC ACC AAC GAT GCT AAC GGT ATT GGT  796
Asn Val Glu Gly Trp Thr Pro Ser Thr Asn Asp Ala Asn Gly Ile Gly
200                 205                 210                 215

GAC CAC GGA TCT TGC TGC TCT GAG ATG GAT ATC TGG GTTTGTTTGC  842
Asp His Gly Ser Cys Cys Ser Glu Met Asp Ile Trp
                220                 225

CGATTTTCCT TTCATCATTA GCATCACAGG TAACTAACAC CCACCTAAG GAA GCG  897
                                                      Glu Ala

AAC AAA GTC TCT ACA GCG TTC ACC CCG CAC CCC TGC ACC ACC ATC GAA  945
Asn Lys Val Ser Thr Ala Phe Thr Pro His Pro Cys Thr Thr Ile Glu
230                 235                 240                 245

CAG CAC ATG TGC GAG GGT GAC TCC TGC GGT GGT ACC TAT TCC GAC GAC  993
Gln His Met Cys Glu Gly Asp Ser Cys Gly Gly Thr Tyr Ser Asp Asp
                250                 255                 260

CGC TAT GGC GTA CTT TGC GAT GCC GAT GGT TGT GAC TTC AAC AGC TAC  1041
Arg Tyr Gly Val Leu Cys Asp Ala Asp Gly Cys Asp Phe Asn Ser Tyr
            265                 270                 275

CGC ATG GGC AAC ACC ACC TTC TAC GGT GAG GGC AAG ACT GTC GAT ACC  1089
Arg Met Gly Asn Thr Thr Phe Tyr Gly Glu Gly Lys Thr Val Asp Thr
        280                 285                 290

AGC TCC AAG TTC ACC GTT GTC ACC CAG TTC ATC AAG GAC TCC GCT GGC  1137
Ser Ser Lys Phe Thr Val Val Thr Gln Phe Ile Lys Asp Ser Ala Gly
    295                 300                 305

GAT CTT GCT GAG ATC AAG GCC TTC TAC GTC CAG AAC GGA AAA GTC ATT  1185
Asp Leu Ala Glu Ile Lys Ala Phe Tyr Val Gln Asn Gly Lys Val Ile
310                 315                 320                 325

GAG AAC TCT CAG TCC AAC GTT GAT GGA GTT TCT GGC AAC TCC ATC ACC  1233
Glu Asn Ser Gln Ser Asn Val Asp Gly Val Ser Gly Asn Ser Ile Thr
                330                 335                 340

CAG TCT TTC TGC AAG TCT CAG AAG ACT GCT TTC GGC GAT ATC GAT GAC  1281
Gln Ser Phe Cys Lys Ser Gln Lys Thr Ala Phe Gly Asp Ile Asp Asp
            345                 350                 355

TTC AAC AAG AAG GGT GGC CTG AAG CAA ATG GGC AAG GCC CTT GCC CAA  1329
Phe Asn Lys Lys Gly Gly Leu Lys Gln Met Gly Lys Ala Leu Ala Gln
        360                 365                 370

GCC ATG GTC CTC GTC ATG TCC ATC TGG GAC GAC CAT GCC GCC AAC ATG  1377
Ala Met Val Leu Val Met Ser Ile Trp Asp Asp His Ala Ala Asn Met
    375                 380                 385

CTC TGG CTC GAC TCC ACC TAC CCT GTC CCG AAG GTC CCC GGT GCT TAC  1425
Leu Trp Leu Asp Ser Thr Tyr Pro Val Pro Lys Val Pro Gly Ala Tyr
390                 395                 400                 405

CGT GGC AGT GGC CCT ACC ACC TCG GGT GTC CCA GCT GAG GTC GAC GCC  1473
Arg Gly Ser Gly Pro Thr Thr Ser Gly Val Pro Ala Glu Val Asp Ala
                410                 415                 420

AAT GCT CCC AAC TCC AAG GTC GCC TTC TCC AAC ATC AAG TTC GGC CAC  1521
Asn Ala Pro Asn Ser Lys Val Ala Phe Ser Asn Ile Lys Phe Gly His
            425                 430                 435

CTC GGG ATC TCT CCT TTT AGC GGC GGC TCT TCC GGC ACC CCT CCT TCC  1569
Leu Gly Ile Ser Pro Phe Ser Gly Gly Ser Ser Gly Thr Pro Pro Ser
        440                 445                 450

AAC CCT TCG AGC TCC GCA AGC CCG ACT TCC TCC ACT GCT AAG CCT TCT  1617
Asn Pro Ser Ser Ser Ala Ser Pro Thr Ser Ser Thr Ala Lys Pro Ser
    455                 460                 465

TCC ACC TCT ACT GCC TCC AAC CCC AGC GGT ACC GGT GCT GCT CAC TGG  1665
Ser Thr Ser Thr Ala Ser Asn Pro Ser Gly Thr Gly Ala Ala His Trp
470                 475                 480                 485

GCT CAG TGC GGT GGT ATT GGC TTC TCT GGC CCC ACC ACT TGC CCA GAG  1713
Ala Gln Cys Gly Gly Ile Gly Phe Ser Gly Pro Thr Thr Cys Pro Glu
                490                 495                 500

CCC TAC ACT TGC GCA AAA GAT CAC GAC ATT TAC TCC CAG TGC GTG  1758
Pro Tyr Thr Cys Ala Lys Asp His Asp Ile Tyr Ser Gln Cys Val
            505                 510                 515

TAAATTACTA GCCTGCTAGG GTAACCTTTT TGGTTCCTCT ACTACGGCAG CTAGGTGAAC  1818

[...]AAGG AACTTCGAGA A  1849

(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 516 amino acids
          (B)
TYPE: amino acid               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     #2:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - Met Arg Ala Ser Leu Leu Ala Phe Ser Leu Al - #a Ala Ala Val Ala Gly     #                 15     - Gly Gln Gln Ala Gly Thr Leu Thr Ala Lys Ar - #g His Pro Ser Leu Thr     #             30     - Trp Gln Lys Cys Thr Arg Gly Gly Cys Pro Th - #r Leu Asn Thr Thr Met     #         45     - Val Leu Asp Ala Asn Trp Arg Trp Thr His Al - #a Thr Ser Gly Ser Thr     #     60     - Lys Cys Tyr Thr Gly Asn Lys Trp Gln Ala Th - #r Leu Cys Pro Asp Gly     # 80     - Lys Ser Cys Ala Ala Asn Cys Ala Leu Asp Gl - #y Ala Asp Tyr Thr Gly     #                 95     - Thr Tyr Gly Ile Thr Gly Ser Gly Trp Ser Le - #u Thr Leu Gln Phe Val     #           110     - Thr Asp Asn Val Gly Ala Arg Ala Tyr Leu Me - #t Ala Asp Asp Thr Gln     #       125     - Tyr Gln Met Leu Glu Leu Leu Asn Gln Glu Le - #u Trp Phe Asp Val Asp     #   140     - Met Ser Asn Ile Pro Cys Gly Leu Asn Gly Al - #a Leu Tyr Leu Ser Ala     145                 1 - #50                 1 - #55                 1 -     #60     - Met Asp Ala Asp Gly Gly Met Arg Lys Tyr Pr - #o Thr Asn Lys Ala Gly     #               175     - Ala Lys Tyr Ala Thr Gly Tyr Cys Asp Ala Gl - #n Cys Pro Arg Asp Leu     #           190     - Lys Tyr Ile Asn Gly Ile Ala Asn Val Glu Gl - #y Trp Thr Pro Ser Thr     #       205     - Asn Asp Ala Asn Gly Ile Gly Asp His Gly Se - #r Cys Cys Ser Glu Met     #   220     - Asp Ile Trp Glu Ala Asn Lys Val Ser Thr Al - #a Phe Thr Pro His Pro     225                 2 - #30                 2 - #35                 2 -     #40     - Cys Thr Thr Ile Glu Gln His Met Cys Glu Gl - #y Asp Ser Cys Gly Gly     #               255     - Thr Tyr Ser Asp Asp Arg Tyr Gly Val Leu Cy - #s Asp Ala Asp Gly Cys     #           270     - Asp Phe Asn Ser Tyr Arg Met Gly Asn Thr Th - #r Phe Tyr Gly Glu Gly     #       285     - Lys Thr Val Asp Thr Ser Ser Lys Phe Thr Va - #l Val Thr Gln Phe 
Ile     #   300     - Lys Asp Ser Ala Gly Asp Leu Ala Glu Ile Ly - #s Ala Phe Tyr Val Gln     305                 3 - #10                 3 - #15                 3 -     #20     - Asn Gly Lys Val Ile Glu Asn Ser Gln Ser As - #n Val Asp Gly Val Ser     #               335     - Gly Asn Ser Ile Thr Gln Ser Phe Cys Lys Se - #r Gln Lys Thr Ala Phe     #           350     - Gly Asp Ile Asp Asp Phe Asn Lys Lys Gly Gl - #y Leu Lys Gln Met Gly     #       365     - Lys Ala Leu Ala Gln Ala Met Val Leu Val Me - #t Ser Ile Trp Asp Asp     #   380     - His Ala Ala Asn Met Leu Trp Leu Asp Ser Th - #r Tyr Pro Val Pro Lys     385                 3 - #90                 3 - #95                 4 -     #00     - Val Pro Gly Ala Tyr Arg Gly Ser Gly Pro Th - #r Thr Ser Gly Val Pro     #               415     - Ala Glu Val Asp Ala Asn Ala Pro Asn Ser Ly - #s Val Ala Phe Ser Asn     #           430     - Ile Lys Phe Gly His Leu Gly Ile Ser Pro Ph - #e Ser Gly Gly Ser Ser     #       445     - Gly Thr Pro Pro Ser Asn Pro Ser Ser Ser Al - #a Ser Pro Thr Ser Ser     #   460     - Thr Ala Lys Pro Ser Ser Thr Ser Thr Ala Se - #r Asn Pro Ser Gly Thr     465                 4 - #70                 4 - #75                 4 -     #80     - Gly Ala Ala His Trp Ala Gln Cys Gly Gly Il - #e Gly Phe Ser Gly Pro     #               495     - Thr Thr Cys Pro Glu Pro Tyr Thr Cys Ala Ly - #s Asp His Asp Ile Tyr     #           510     - Ser Gln Cys Val             515     - (2) INFORMATION FOR SEQ ID NO:3:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 525 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -    (iii) HYPOTHETICAL: NO     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: H. 
grisea     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:     -      Met Arg Thr Ala Lys Phe Ala Thr - # Leu Ala Ala Leu Val Ala Ser     Ala     #   15     -      Ala Ala Gln Gln Ala Cys Ser Leu - # Thr Thr Glu Arg His Pro Ser     Leu     #                 30     -      Ser Trp Lys Lys Cys Thr Ala Gly - # Gly Gln Cys Gln Thr Val Gln     Ala     #             45     -      Ser Ile Thr Leu Asp Ser Asn Trp - # Arg Trp Thr His Gln Val Ser     Gly     #         60     -      Ser Thr Asn Cys Tyr Thr Gly Asn - # Lys Trp Asp Thr Ser Ile Cys     Thr     #     80     -      Asp Ala Lys Ser Cys Ala Gln Asn - # Cys Cys Val Asp Gly Ala Asp     Tyr     #   95     -      Thr Ser Thr Tyr Gly Ile Thr Thr - # Asn Gly Asp Ser Leu Ser Leu     Lys     #                110     -      Phe Val Thr Lys Gly Gln Tyr Ser - # Thr Asn Val Gly Ser Arg Thr     Tyr     #            125     -      Leu Met Asp Gly Glu Asp Lys Tyr - # Gln Thr Phe Glu Leu Leu Gly     Asn     #        140     -      Glu Phe Thr Phe Asp Val Asp Val - # Ser Asn Ile Gly Cys Gly Leu     Asn     #    160     -      Gly Ala Leu Tyr Phe Val Ser Met - # Asp Ala Asp Gly Gly Leu Ser     Arg     #   175     -      Tyr Pro Gly Asn Lys Ala Gly Ala - # Lys Tyr Gly Thr Gly Tyr Cys     Asp     #                190     -      Ala Gln Cys Pro Arg Asp Ile Lys - # Phe Ile Asn Gly Glu Ala Asn     Ile     #            205     -      Glu Gly Trp Thr Gly Ser Thr Asn - # Asp Pro Asn Ala Gly Ala Gly     Arg     #        220     -      Tyr Gly Thr Cys Cys Ser Glu Met - # Asp Ile Trp Glu Ala Asn Asn     Met     #    240     -      Ala Thr Ala Phe Thr Pro His Pro - # Cys Thr Ile Ile Gly Gln Ser     Arg     #   255     -      Cys Glu Gly Asp Ser Cys Gly Gly - # Thr Tyr Ser Asn Glu Arg Tyr     Ala     #                270     -      Gly Val Cys Asp Pro Asp Gly Cys - # Asp Phe Asn Ser Tyr Arg Gln     Gly     #            285     -      Asn Lys Thr Phe Tyr Gly Lys Gly - # Met Thr Val Asp Thr Thr Lys     Lys     #        300     -      Ile Thr Val Val 
Thr Gln Phe Leu - # Lys Asp Ala Asn Gly Asp Leu     Gly     #    320     -      Glu Ile Lys Arg Phe Tyr Val Gln - # Asp Gly Lys Ile Ile Pro Asn     Ser     #   335     -      Glu Ser Thr Ile Pro Gly Val Glu - # Gly Asn Ser Ile Thr Gln Asp     Trp     #                350     -      Cys Asp Arg Gln Lys Val Ala Phe - # Gly Asp Ile Asp Asp Phe Asn     Arg     #            365     -      Lys Gly Gly Met Lys Gln Met Gly - # Lys Ala Leu Ala Gly Pro Met     Val     #        380     -      Leu Val Met Ser Ile Trp Asp Asp - # His Ala Ser Asn Met Leu Trp     Leu     #    400     -      Asp Ser Thr Phe Pro Val Asp Ala - # Ala Gly Lys Pro Gly Ala Glu     Arg     #   415     -      Gly Ala Cys Pro Thr Thr Ser Gly - # Val Pro Ala Glu Val Glu Ala     Glu     #                430     -      Ala Pro Asn Ser Asn Val Val Phe - # Ser Asn Ile Arg Phe Gly Pro     Ile     #            445     -      Gly Ser Thr Val Ala Gly Leu Pro - # Gly Ala Gly Asn Gly Gly Asn     Asn     #        460     -      Gly Gly Asn Pro Pro Pro Pro Thr - # Thr Thr Thr Ser Ser Ala Pro     Ala     #    480     -      Thr Thr Thr Thr Ala Ser Ala Gly - # Pro Lys Ala Gly Arg Trp Gln     Gln     #   495     -      Cys Gly Gly Ile Gly Phe Thr Gly - # Pro Thr Gln Cys Glu Glu Pro     Tyr     #                510     -      Thr Cys Thr Lys Leu Asn Asp Trp - # Tyr Ser Gln Cys Leu     #            525     - (2) INFORMATION FOR SEQ ID NO:4:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 513 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -    (iii) HYPOTHETICAL: NO     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: T.reesei     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:     -      Met Tyr Arg Lys Leu Ala Val Ile - # Ser Ala Phe Leu Ala Thr Ala     Arg     #   15     -      Ala Gln Ser Ala Cys Thr Leu Gln - # Ser Glu Thr His Pro Pro Leu     Thr     #                 30     -      Trp 
Gln Lys Cys Ser Ser Gly Gly - # Thr Cys Thr Gln Gln Thr Gly     Ser     #             45     -      Val Val Ile Asp Ala Asn Trp Arg - # Trp Thr His Ala Thr Asn Ser     Ser     #         60     -      Thr Asn Cys Tyr Asp Gly Asn Thr - # Trp Ser Ser Thr Leu Cys Pro     Asp     #     80     -      Asn Glu Thr Cys Ala Lys Asn Cys - # Cys Leu Asp Gly Ala Ala Tyr     Ala     #   95     -      Ser Thr Tyr Gly Val Thr Thr Ser - # Gly Asn Ser Leu Ser Ile Gly     Phe     #                110     -      Val Thr Gln Ser Ala Gln Lys Asn - # Val Gly Ala Arg Leu Tyr Leu     Met     #            125     -      Ala Ser Asp Thr Thr Tyr Gln Glu - # Phe Thr Leu Leu Gly Asn Glu     Phe     #        140     -      Ser Phe Asp Val Asp Val Ser Gln - # Leu Pro Cys Gly Leu Asn Gly     Ala     #    160     -      Leu Tyr Phe Val Ser Met Asp Ala - # Asp Gly Gly Val Ser Lys Tyr     Pro     #   175     -      Thr Asn Thr Ala Gly Ala Lys Tyr - # Gly Thr Gly Tyr Cys Asp Ser     Gln     #                190     -      Cys Pro Arg Asp Leu Lys Phe Ile - # Asn Gly Gln Ala Asn Val Glu     Gly     #            205     -      Trp Glu Pro Ser Ser Asn Asn Ala - # Asn Thr Gly Ile Gly Gly His     Gly     #        220     -      Ser Cys Cys Ser Glu Met Asp Ile - # Trp Glu Ala Asn Ser Ile Ser     Glu     #    240     -      Ala Leu Thr Pro His Pro Cys Thr - # Thr Val Gly Gln Glu Ile Cys     Glu     #   255     -      Gly Asp Gly Cys Gly Gly Thr Tyr - # Ser Asp Asn Arg Tyr Gly Gly     Thr     #                270     -      Cys Asp Pro Asp Gly Cys Asp Trp - # Asn Pro Tyr Arg Leu Gly Asn     Thr     #            285     -      Ser Phe Tyr Gly Pro Gly Ser Ser - # Phe Thr Leu Asp Thr Thr Lys     Lys     #        300     -      Leu Thr Val Val Thr Gln Phe Glu - # Thr Ser Gly Ala Ile Asn Arg     Tyr     #    320     -      Tyr Val Gln Asn Gly Val Thr Phe - # Gln Gln Pro Asn Ala Glu Leu     Gly     #   335     -      Ser Tyr Ser Gly Asn Glu Leu Asn - # Asp Asp Tyr Cys Thr Ala Glu     Glu     #                350     
-      Ala Glu Phe Gly Gly Ser Ser Phe - # Ser Asp Lys Gly Gly Leu Thr     Gln     #            365     -      Phe Lys Lys Ala Thr Ser Gly Gly - # Met Val Leu Val Met Ser Leu     Trp     #        380     -      Asp Asp Tyr Tyr Ala Asn Met Leu - # Trp Leu Asp Ser Thr Tyr Pro     Thr     #    400     -      Asn Glu Thr Ser Ser Thr Pro Gly - # Ala Val Arg Gly Ser Cys Ser     Thr     #   415     -      Ser Ser Gly Val Pro Ala Gln Val - # Glu Ser Gln Ser Pro Asn Ala     Lys     #                430     -      Val Thr Phe Ser Asn Ile Lys Phe - # Gly Pro Ile Gly Ser Thr Gly     Asn     #            445     -      Pro Ser Gly Gly Asn Pro Pro Gly - # Gly Asn Arg Gly Thr Thr Thr     Thr     #        460     -      Arg Arg Pro Ala Thr Thr Thr Gly - # Ser Ser Pro Gly Pro Thr Gln     Ser     #    480     -      His Tyr Gly Gln Cys Gly Gly Ile - # Gly Tyr Ser Gly Pro Thr Val     Cys     #   495     -      Ala Ser Gly Thr Thr Cys Gln Val - # Leu Asn Pro Tyr Tyr Ser Gln     Cys     #                510     -      Leu     - (2) INFORMATION FOR SEQ ID NO:5:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 513 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -    (iii) HYPOTHETICAL: NO     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: T. 
viride     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:     -      Met Tyr Gln Lys Leu Ala Leu Ile - # Ser Ala Phe Leu Ala Thr Ala     Arg     #   15     -      Ala Gln Ser Ala Cys Thr Leu Gln - # Ala Glu Thr His Pro Pro Leu     Thr     #                 30     -      Trp Gln Lys Cys Ser Ser Gly Gly - # Thr Cys Thr Gln Gln Thr Gly     Ser     #             45     -      Val Val Ile Asp Ala Asn Trp Arg - # Trp Thr His Ala Thr Asn Ser     Ser     #         60     -      Thr Asn Cys Tyr Asp Gly Asn Thr - # Trp Ser Ser Thr Leu Cys Pro     Asp     #     80     -      Asn Glu Thr Cys Ala Lys Asn Cys - # Cys Leu Asp Gly Ala Ala Tyr     Ala     #   95     -      Ser Thr Tyr Gly Val Thr Thr Ser - # Ala Asp Ser Leu Ser Ile Gly     Phe     #                110     -      Val Thr Gln Ser Ala Gln Lys Asn - # Val Gly Ala Arg Leu Tyr Leu     Met     #            125     -      Ala Ser Asp Thr Thr Tyr Gln Glu - # Phe Thr Leu Leu Gly Asn Glu     Phe     #        140     -      Ser Phe Asp Val Asp Val Ser Gln - # Leu Pro Cys Gly Leu Asn Gly     Ala     #    160     -      Leu Tyr Phe Val Ser Met Asp Ala - # Asp Gly Gly Val Thr Lys Tyr     Pro     #   175     -      Thr Asn Thr Ala Gly Ala Lys Tyr - # Gly Thr Gly Tyr Cys Asp Ser     Gln     #                190     -      Cys Pro Arg Asp Leu Lys Phe Ile - # Asn Gly Gln Ala Asn Val Glu     Gly     #            205     -      Trp Glu Pro Ser Ser Asn Asn Ala - # Asn Thr Gly Ile Gly Gly His     Gly     #        220     -      Ser Cys Cys Ser Glu Met Asp Ile - # Trp Glu Ala Asn Ser Ile Ser     Glu     #    240     -      Ala Leu Thr Pro His Pro Cys Thr - # Thr Val Gly Gln Glu Ile Cys     Glu     #   255     -      Gly Asp Ser Cys Gly Gly Thr Tyr - # Ser Gly Asp Arg Tyr Gly Gly     Thr     #                270     -      Cys Asp Pro Asp Gly Cys Asp Trp - # Asn Pro Tyr Arg Leu Gly Asn     Thr     #            285     -      Ser Phe Tyr Gly Pro Gly Ser Ser - # Phe Thr Leu Asp Thr Thr Lys     Lys     #        300     -      Leu Thr Val Val 
Thr Gln Phe Glu - # Thr Ser Gly Ala Ile Asn Arg     Tyr     #    320     -      Tyr Val Gln Asn Gly Val Thr Phe - # Gln Gln Pro Asn Ala Glu Leu     Gly     #   335     -      Asp Tyr Ser Gly Asn Ser Leu Asp - # Asp Asp Tyr Cys Ala Ala Glu     Glu     #                350     -      Ala Glu Phe Gly Gly Ser Ser Phe - # Ser Asp Lys Gly Gly Leu Thr     Gln     #            365     -      Phe Lys Lys Ala Thr Ser Gly Gly - # Met Val Leu Val Met Ser Leu     Trp     #        380     -      Asp Asp Tyr Tyr Ala Asn Met Leu - # Trp Leu Asp Ser Thr Tyr Pro     Thr     #    400     -      Asp Glu Thr Ser Ser Thr Pro Gly - # Ala Val Arg Gly Ser Ser Ser     Thr     #   415     -      Ser Ser Gly Val Pro Ala Gln Leu - # Glu Ser Asn Ser Pro Asn Ala     Lys     #                430     -      Val Val Tyr Ser Asn Ile Lys Phe - # Gly Pro Ile Gly Ser Thr Gly     Asn     #            445     -      Pro Ser Gly Gly Asn Pro Pro Gly - # Gly Asn Pro Pro Gly Thr Thr     Thr     #        460     -      Pro Arg Pro Ala Thr Ser Thr Gly - # Ser Ser Pro Gly Pro Thr Gln     Thr     #    480     -      His Tyr Gly Gln Cys Gly Gly Ile - # Gly Tyr Ile Gly Pro Thr Val     Cys     #   495     -      Ala Ser Gly Ser Thr Cys Gln Val - # Leu Asn Pro Tyr Tyr Ser Gln     Cys     #                510     -      Leu     - (2) INFORMATION FOR SEQ ID NO:6:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 459 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -    (iii) HYPOTHETICAL: NO     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: T. 
reesei     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:     -      Met Ala Pro Ser Val Thr Leu Pro - # Leu Thr Thr Ala Ile Leu Ala     Ile     #   15     -      Ala Arg Leu Val Ala Ala Gln Gln - # Pro Gly Thr Ser Thr Pro Glu     Val     #                 30     -      His Pro Lys Leu Thr Thr Tyr Lys - # Cys Thr Lys Ser Gly Gly Cys     Val     #             45     -      Ala Gln Asp Thr Ser Val Val Leu - # Asp Trp Asn Tyr Arg Trp Met     His     #         60     -      Asp Ala Asn Tyr Asn Ser Cys Thr - # Val Asn Gly Gly Val Asn Thr     Thr     #     80     -      Leu Cys Pro Asp Glu Ala Thr Cys - # Gly Lys Asn Cys Phe Ile Glu     Gly     #   95     -      Val Asp Tyr Ala Ala Ser Gly Val - # Thr Thr Ser Gly Ser Ser Leu     Thr     #                110     -      Met Asn Gln Tyr Met Pro Ser Ser - # Ser Gly Gly Tyr Ser Ser Val     Ser     #            125     -      Pro Arg Leu Tyr Leu Leu Asp Ser - # Asp Gly Glu Tyr Val Met Leu     Lys     #        140     -      Leu Asn Gly Gln Glu Leu Ser Phe - # Asp Val Asp Leu Ser Ala Leu     Pro     #    160     -      Cys Gly Glu Asn Gly Ser Leu Tyr - # Leu Ser Gln Met Asp Glu Asn     Gly     #   175     -      Gly Ala Asn Gln Tyr Asn Thr Ala - # Gly Ala Asn Tyr Gly Ser Gly     Tyr     #                190     -      Cys Asp Ala Gln Cys Pro Val Gln - # Thr Trp Arg Asn Gly Thr Leu     Asn     #            205     -      Thr Ser His Gln Gly Phe Cys Cys - # Asn Glu Met Asp Ile Leu Glu     Gly     #        220     -      Asn Ser Arg Ala Asn Ala Leu Thr - # Pro His Ser Cys Thr Ala Thr     Ala     #    240     -      Cys Asp Ser Ala Gly Cys Gly Phe - # Asn Pro Tyr Gly Ser Gly Tyr     Lys     #   255     -      Ser Tyr Tyr Gly Pro Gly Asp Thr - # Val Asp Thr Ser Lys Thr Phe     Thr     #                270     -      Ile Ile Thr Gln Phe Asn Thr Asp - # Asn Gly Ser Pro Ser Gly Asn     Leu     #            285     -      Val Ser Ile Thr Arg Lys Tyr Gln - # Gln Asn Gly Val Asp Ile Pro     Ser     #        300     -      Ala Gln Pro Gly 
Gly Asp Thr Ile - # Ser Ser Cys Pro Ser Ala Ser     Ala     #    320     -      Tyr Gly Gly Leu Ala Thr Met Gly - # Lys Ala Leu Ser Ser Gly Met     Val     #   335     -      Leu Val Phe Ser Ile Trp Asn Asp - # Asn Ser Gln Tyr Met Asn Trp     Leu     #                350     -      Asp Ser Gly Asn Ala Gly Pro Cys - # Ser Ser Thr Glu Gly Asn Pro     Ser     #            365     -      Asn Ile Leu Ala Asn Asn Pro Asn - # Thr His Val Val Phe Ser Asn     Ile     #        380     -      Arg Trp Gly Asp Ile Gly Ser Thr - # Thr Asn Ser Thr Ala Pro Pro     Pro     #    400     -      Pro Pro Ala Ser Ser Thr Thr Phe - # Ser Thr Thr Arg Arg Ser Ser     Thr     #   415     -      Thr Ser Ser Ser Pro Ser Cys Thr - # Gln Thr His Trp Gly Gln Cys     Gly     #                430     -      Gly Ile Gly Tyr Ser Gly Cys Lys - # Thr Cys Thr Ser Gly Thr Thr     Cys     #            445     -      Gln Tyr Ser Asn Asp Tyr Tyr Ser - # Gln Cys Leu     #        455     - (2) INFORMATION FOR SEQ ID NO:7:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 516 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -    (iii) HYPOTHETICAL: NO     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: P. 
chryso - #sporium     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:     -      Met Phe Arg Thr Ala Thr Leu Leu - # Ala Phe Thr Met Ala Ala Met     Val     #   15     -      Phe Gly Gln Gln Val Gly Thr Asn - # Thr Ala Glu Asn His Arg Thr     Leu     #                 30     -      Thr Ser Gln Lys Cys Thr Lys Ser - # Gly Gly Cys Ser Asn Leu Asn     Thr     #             45     -      Lys Ile Val Leu Asp Ala Asn Trp - # Arg Trp Leu His Ser Thr Ser     Gly     #         60     -      Tyr Thr Asn Cys Tyr Thr Gly Asn - # Gln Trp Asp Ala Thr Leu Cys     Pro     #     80     -      Asp Gly Lys Thr Cys Ala Ala Asn - # Cys Ala Leu Asp Gly Ala Asp     Tyr     #   95     -      Thr Gly Thr Tyr Gly Ile Thr Ala - # Ser Gly Ser Ser Leu Lys Leu     Gln     #                110     -      Phe Val Thr Gly Ser Asn Val Gly - # Ser Arg Val Tyr Leu Met Ala     Asp     #            125     -      Asp Thr His Tyr Gln Met Phe Gln - # Leu Leu Asn Gln Glu Phe Thr     Phe     #        140     -      Asp Val Asp Met Ser Asn Leu Pro - # Cys Gly Leu Asn Gly Ala Leu     Tyr     #    160     -      Leu Ser Ala Met Asp Ala Asp Gly - # Gly Met Ala Lys Tyr Pro Thr     Asn     #   175     -      Lys Ala Gly Ala Lys Tyr Gly Thr - # Gly Tyr Cys Asp Ser Gln Cys     Pro     #                190     -      Arg Asp Ile Lys Phe Ile Asn Gly - # Glu Ala Asn Val Glu Gly Trp     Asn     #            205     -      Ala Thr Ser Ala Asn Ala Gly Thr - # Gly Asn Tyr Gly Thr Cys Cys     Thr     #        220     -      Glu Met Asp Ile Trp Glu Ala Asn - # Asn Asp Ala Ala Ala Tyr Thr     Pro     #    240     -      His Pro Cys Thr Thr Asn Ala Gln - # Thr Arg Cys Ser Gly Ser Asp     Cys     #   255     -      Thr Arg Asp Thr Gly Leu Cys Asp - # Ala Asp Gly Cys Asp Phe Asn     Ser     #                270     -      Phe Arg Met Gly Asp Gln Thr Phe - # Leu Gly Lys Gly Leu Thr Val     Asp     #            285     -      Thr Ser Lys Pro Phe Thr Val Val - # Thr Gln Phe Ile Thr Asn Asp     Gly     #        300     -      Thr 
Ser Ala Gly Thr Leu Thr Glu - # Ile Arg Arg Leu Tyr Val Gln     Asn     #    320     -      Gly Lys Val Ile Gln Asn Ser Ser - # Val Lys Ile Pro Gly Ile Asp     Leu     #   335     -      Val Asn Ser Ile Thr Asp Asn Phe - # Cys Ser Gln Gln Lys Thr Ala     Phe     #                350     -      Gly Asp Thr Asn Tyr Phe Ala Gln - # His Gly Gly Leu Lys Gln Val     Gly     #            365     -      Glu Ala Leu Arg Thr Gly Met Val - # Leu Ala Leu Ser Ile Trp Asp     Asp     #        380     -      Tyr Ala Ala Asn Met Leu Trp Leu - # Asp Ser Asn Tyr Pro Thr Asn     Lys     #    400     -      Asp Pro Ser Thr Pro Gly Val Ala - # Arg Gly Thr Cys Ala Thr Thr     Ser     #   415     -      Gly Val Pro Ala Gln Ile Glu Ala - # Gln Ser Pro Asn Ala Tyr Val     Val     #                430     -      Phe Ser Asn Ile Lys Phe Gly Asp - # Leu Asn Thr Thr Tyr Thr Gly     Thr     #            445     -      Val Ser Ser Ser Ser Val Ser Ser - # Ser His Ser Ser Thr Ser Thr     Ser     #        460     -      Ser Ser His Ser Ser Ser Ser Thr - # Pro Pro Thr Gln Pro Thr Gly     Val     #    480     -      Thr Val Pro Gln Trp Gly Gln Cys - # Gly Gly Ile Gly Tyr Thr Gly     Ser     #   495     -      Thr Thr Cys Ala Ser Pro Tyr Thr - # Cys His Val Leu Asn Pro Tyr     Tyr     #                510     -      Ser Gln Cys Tyr                  515     __________________________________________________________________________ 

We claim:
1. A method of purifying a fusion protein containing a heterologous protein comprising:
(a) transforming a host cell with an expression cassette encoding a fusion protein, wherein the fusion protein contains a C-terminal cellulose binding domain of a Neurospora crassa cellobiohydrolase-1, a linking region, and a heterologous protein; wherein the C-terminal cellulose binding domain of cellobiohydrolase-1 and the heterologous protein are linked together by the linking region; and wherein the host cell expresses the fusion protein;
(b) contacting the fusion protein with a cellulose matrix; wherein the fusion protein binds to the cellulose matrix;
(c) washing the cellulose matrix; and
(d) eluting the fusion protein from the cellulose matrix; wherein the fusion protein is purified.
2. A method of purifying a heterologous protein comprising:
(a) transforming a host cell with an expression cassette encoding a fusion protein wherein the fusion protein contains a C-terminal cellulose binding domain of a Neurospora crassa cellobiohydrolase-1, a linking region, and a heterologous protein; wherein the C-terminal cellulose binding domain of cellobiohydrolase-1 and the heterologous protein are linked together by the linking region; wherein the linking region contains a specific proteolytic cleavage site; and wherein the host cell expresses the fusion protein;
(b) contacting the fusion protein with a cellulose matrix; wherein the fusion protein binds to the cellulose matrix;
(c) washing the cellulose matrix; and
(d) contacting the specific proteolytic cleavage site with a proteolytic protein that specifically recognizes the specific proteolytic cleavage site, and therein releases the heterologous protein from the cellulose matrix; wherein the heterologous protein is purified.
3. A method of immobilizing a heterologous protein on a cellulose matrix comprising contacting a fusion protein containing a C-terminal cellulose binding domain of a Neurospora crassa cellobiohydrolase-1, and a heterologous protein with a cellulose matrix; wherein the heterologous protein is immobilized on the cellulose matrix.
4. A method of using the immobilized heterologous protein of claim 3 as a biocatalyst comprising:
(a) contacting a substrate for the heterologous protein with the immobilized heterologous protein; and
(b) collecting the product.